NSF PAR Search | NSF Public Access Repository

Diffusion Models With Learned Adaptive Noise

Sahoo, Subham; Gokaslan, Aaron; De_Sa, Christopher; Kuleshov, Volodymyr (December 2024, NeurIPS 2024)

Diffusion models have gained traction as powerful algorithms for synthesizing high-quality images. Central to these algorithms is the diffusion process, a set of equations which maps data to noise in a way that can significantly affect performance. In this paper, we explore whether the diffusion process can be learned from data. Our work is grounded in Bayesian inference and seeks to improve log-likelihood estimation by casting the learned diffusion process as an approximate variational posterior that yields a tighter lower bound (ELBO) on the likelihood. A widely held assumption is that the ELBO is invariant to the noise process: our work dispels this assumption and proposes multivariate learned adaptive noise (MuLAN), a learned diffusion process that applies noise at different rates across an image. Our method consists of three components: a multivariate noise schedule, adaptive input-conditional diffusion, and auxiliary variables; these components ensure that the ELBO is no longer invariant to the choice of the noise schedule as in previous works. Empirically, MuLAN sets a new state-of-the-art in density estimation on CIFAR-10 and ImageNet while matching the performance of previous state-of-the-art models with 50% fewer steps. We provide the code, along with a blog post and video tutorial on the project page: https://s-sahoo.com/MuLAN
Full Text Available
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks

Tseng, Albert; Chee, Jerry; Sun, Qingyao; Kuleshov, Volodymyr; De_Sa, Christopher (July 2024, ICML)

Full Text Available
Shadow Cones: A Generalized Framework for Partial Order Embeddings

Yu, Tao; Liu, Toni; Tseng, Albert; De_Sa, Christopher (May 2024, ICLR 2024)

Hyperbolic space has proven to be well-suited for capturing hierarchical relations in data, such as trees and directed acyclic graphs. Prior work introduced the concept of entailment cones, which uses partial orders defined by nested cones in the Poincaré ball to model hierarchies. Here, we introduce the "shadow cones" framework, a physics-inspired entailment cone construction. Specifically, we model partial orders as subset relations between shadows formed by a light source and opaque objects in hyperbolic space. The shadow cones framework generalizes entailment cones to a broad class of formulations and hyperbolic space models beyond the Poincaré ball. This results in clear advantages over existing constructions: for example, shadow cones possess better optimization properties over constructions limited to the Poincaré ball. Our experiments on datasets of various sizes and hierarchical structures show that shadow cones consistently and significantly outperform existing entailment cone constructions. These results indicate that shadow cones are an effective way to model partial orders in hyperbolic space, offering physically intuitive and novel insights about the nature of such structures.
Full Text Available
ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers

Yin, Junjie; Dong, Jiahao; Wang, Yingheng; De_Sa, Christopher; Kuleshov, Volodymyr (January 2024, TMLR)
QuIP: 2-Bit Quantization of Large Language Models with Guarantees

Chee, Jerry; Cai, Yaohui; Kuleshov, Volodymyr; De_Sa, Christopher M (December 2023, NeurIPS)
Neural Caches for Monte Carlo Partial Differential Equation Solvers

https://doi.org/10.1145/3610548.3618141

Li, Zilu; Yang, Guandao; Deng, Xi; De_Sa, Christopher; Hariharan, Bharath; Marschner, Steve (December 2023, ACM)

This paper presents a method that uses neural networks as a caching mechanism to reduce the variance of Monte Carlo Partial Differential Equation solvers, such as the Walk-on-Spheres algorithm [Sawhney and Crane 2020]. While these Monte Carlo PDE solvers have the merits of being unbiased and discretization-free, their high variance often hinders real-time applications. On the other hand, neural networks can approximate the PDE solution, and evaluating these networks at inference time can be very fast. However, neural-network-based solutions may suffer from convergence difficulties and high bias. Our hybrid system aims to combine these two potentially complementary solutions by training a neural field to approximate the PDE solution using supervision from a WoS solver. This neural field is then used as a cache in the WoS solver to reduce variance during inference. We demonstrate that our neural field training procedure is better than the commonly used self-supervised objectives in the literature. We also show that our hybrid solver exhibits lower variance than WoS with the same computational budget: it is significantly better for small compute budgets and provides smaller improvements for larger budgets, reaching the same performance as WoS in the limit.
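As a rough illustration of the plain Walk-on-Spheres estimator that the abstract builds on (not the authors' neural-cache hybrid), here is a minimal Python sketch for Laplace's equation on the unit disk. The choice of domain, the boundary function `g`, the stopping tolerance `eps`, and the function names are all assumptions made for this example.

```python
import math
import random


def walk_on_spheres(x, y, g, eps=1e-3, rng=random):
    """Draw one WoS sample of the harmonic function with boundary data g.

    Assumes the domain is the unit disk, so the distance to the boundary
    is simply 1 - |(x, y)|.
    """
    while True:
        r = math.hypot(x, y)
        dist = 1.0 - r  # radius of the largest circle around (x, y) inside the disk
        if dist < eps:
            # Close enough: project onto the unit circle and read the boundary value.
            return g(x / r, y / r)
        # Jump to a uniformly random point on that largest inscribed circle.
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += dist * math.cos(theta)
        y += dist * math.sin(theta)


def estimate(x, y, g, n_walks=2000, seed=0):
    """Average n_walks independent WoS samples started at (x, y)."""
    rng = random.Random(seed)
    total = sum(walk_on_spheres(x, y, g, rng=rng) for _ in range(n_walks))
    return total / n_walks
```

For boundary data g(x, y) = x the exact solution is u(x, y) = x, so `estimate(0.3, 0.2, lambda bx, by: bx)` should land near 0.3 up to Monte Carlo noise. That per-sample noise is the variance the paper aims to reduce.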
Full Text Available
Arbitrariness and Social Prediction: The Confounding Role of Variance in Fair Classification

https://doi.org/10.1609/aaai.v38i20.30203

Cooper, A_Feder; Lee, Katherine; Choksi, Madiha Zahrah; Barocas, Solon; De_Sa, Christopher; Grimmelmann, James; Kleinberg, Jon; Sen, Siddhartha; Zhang, Baobao (March 2024, Proceedings of the AAAI Conference on Artificial Intelligence)

Variance in predictions across different trained models is a significant, under-explored source of error in fair binary classification. In practice, the variance on some data examples is so large that decisions can be effectively arbitrary. To investigate this problem, we take an experimental approach and make four overarching contributions. We: 1) Define a metric called self-consistency, derived from variance, which we use as a proxy for measuring and reducing arbitrariness; 2) Develop an ensembling algorithm that abstains from classification when a prediction would be arbitrary; 3) Conduct the largest to-date empirical study of the role of variance (vis-a-vis self-consistency and arbitrariness) in fair binary classification; and, 4) Release a toolkit that makes the US Home Mortgage Disclosure Act (HMDA) datasets easily usable for future research. Altogether, our experiments reveal shocking insights about the reliability of conclusions on benchmark datasets. Most fair binary classification benchmarks are close-to-fair when taking into account the amount of arbitrariness present in predictions -- before we even try to apply any fairness interventions. This finding calls into question the practical utility of common algorithmic fairness methods, and in turn suggests that we should reconsider how we choose to measure fairness in binary classification.
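To make the idea of an agreement-based metric and an abstaining ensemble concrete, the following Python sketch scores one example by the fraction of distinct model pairs that agree on its binary label, and abstains when that score is low. The abstract says only that self-consistency is derived from variance; the pairwise-agreement form, the `threshold` value, and the function names here are assumptions for illustration, not necessarily the authors' exact definitions.

```python
from typing import Optional, Sequence


def self_consistency(preds: Sequence[int]) -> float:
    """Fraction of distinct model pairs that agree on one example's 0/1 label."""
    b = len(preds)
    if b < 2:
        raise ValueError("need predictions from at least two models")
    b1 = sum(1 for p in preds if p == 1)  # models predicting 1
    b0 = b - b1                           # models predicting 0
    # b0 * b1 pairs disagree out of b * (b - 1) / 2 distinct pairs.
    return 1.0 - (2.0 * b0 * b1) / (b * (b - 1))


def predict_or_abstain(preds: Sequence[int], threshold: float = 0.75) -> Optional[int]:
    """Majority vote, or None (abstain) when self-consistency is below threshold.

    The threshold is a hypothetical tuning knob; exact ties resolve to 0 if
    the threshold is set low enough to let them through.
    """
    if self_consistency(preds) < threshold:
        return None
    return int(2 * sum(preds) > len(preds))
```

Under this sketch, an example on which the models split evenly scores close to 0.5 and would be abstained on, while a unanimous example scores 1.0.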
Full Text Available
InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models

Wang, Yingheng; Schiff, Yair; Gokaslan, Aaron; Pan, Weishen; Wang, Fei; De_Sa, Christopher; Kuleshov, Volodymyr (July 2023, International Conference on Machine Learning)
